Exploration server on Monteverdi

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Authorship Attribution of Texts: A Review

Internal identifier: 000393 (Main/Exploration); previous: 000392; next: 000394

Authors: B. Malyutov

Source: Lecture Notes in Computer Science, 2006

RBID : ISTEX:F438C14FA8A87CCBCE3F5DF947E3C1395655768B

Abstract

We survey the authorship attribution of documents given some prior stylistic characteristics of the author’s writing extracted from a corpus of known works, e.g., authentication of disputed documents or literary works. Although the pioneering paper based on word length histograms appeared at the very end of the nineteenth century, the resolution power of this and other stylometry approaches is yet to be studied both theoretically and on case studies such that additional information can assist finding the correct attribution. We survey several theoretical approaches including ones approximating the apparently nearly optimal one based on Kolmogorov conditional complexity and some case studies: attributing Shakespeare canon and newly discovered works as well as allegedly M. Twain’s newly-discovered works, Federalist papers binary (Madison vs. Hamilton) discrimination using Naive Bayes and other classifiers, and steganography presence testing. The latter topic is complemented by a sketch of an anagrams ambiguity study based on the Shannon cryptography theory.

Url:
DOI: 10.1007/11889342_20


Affiliations:



The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Authorship Attribution of Texts: A Review</title>
<author>
<name sortKey="Malyutov, B" sort="Malyutov, B" uniqKey="Malyutov B" first="B." last="Malyutov">B. Malyutov</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:F438C14FA8A87CCBCE3F5DF947E3C1395655768B</idno>
<date when="2006" year="2006">2006</date>
<idno type="doi">10.1007/11889342_20</idno>
<idno type="url">https://api.istex.fr/document/F438C14FA8A87CCBCE3F5DF947E3C1395655768B/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000485</idno>
<idno type="wicri:Area/Istex/Curation">000416</idno>
<idno type="wicri:Area/Istex/Checkpoint">000361</idno>
<idno type="wicri:doubleKey">0302-9743:2006:Malyutov B:authorship:attribution:of</idno>
<idno type="wicri:Area/Main/Merge">000412</idno>
<idno type="wicri:Area/Main/Curation">000410</idno>
<idno type="wicri:Area/Main/Exploration">000393</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Authorship Attribution of Texts: A Review</title>
<author>
<name sortKey="Malyutov, B" sort="Malyutov, B" uniqKey="Malyutov B" first="B." last="Malyutov">B. Malyutov</name>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2006</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
<idno type="istex">F438C14FA8A87CCBCE3F5DF947E3C1395655768B</idno>
<idno type="DOI">10.1007/11889342_20</idno>
<idno type="ChapterID">Chap20</idno>
<idno type="ChapterID">20</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: We survey the authorship attribution of documents given some prior stylistic characteristics of the author’s writing extracted from a corpus of known works, e.g., authentication of disputed documents or literary works. Although the pioneering paper based on word length histograms appeared at the very end of the nineteenth century, the resolution power of this and other stylometry approaches is yet to be studied both theoretically and on case studies such that additional information can assist finding the correct attribution. We survey several theoretical approaches including ones approximating the apparently nearly optimal one based on Kolmogorov conditional complexity and some case studies: attributing Shakespeare canon and newly discovered works as well as allegedly M. Twain’s newly-discovered works, Federalist papers binary (Madison vs. Hamilton) discrimination using Naive Bayes and other classifiers, and steganography presence testing. The latter topic is complemented by a sketch of an anagrams ambiguity study based on the Shannon cryptography theory.</div>
</front>
</TEI>
<affiliations>
<list></list>
<tree>
<noCountry>
<name sortKey="Malyutov, B" sort="Malyutov, B" uniqKey="Malyutov B" first="B." last="Malyutov">B. Malyutov</name>
</noCountry>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Musique/explor/MonteverdiV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000393 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000393 | SxmlIndent | more
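
Outside Dilib, the record can also be inspected with a general-purpose XML parser. The following is a minimal Python sketch, not part of the Wicri tooling: it works on an abbreviated copy of the record shown above and pulls out the title, author, year, DOI, and RBID. Note that the record as displayed uses the `wicri:` prefix without declaring it, so a namespace-aware parser rejects it as is; the sketch binds the prefix to a placeholder URI (`urn:example:wicri`, an assumption made purely for illustration, as are the helper names `parse_record` and `summarize`).

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the record shown above (only a few fields kept).
RECORD = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Authorship Attribution of Texts: A Review</title>
<author>
<name sortKey="Malyutov, B" sort="Malyutov, B" uniqKey="Malyutov B" first="B." last="Malyutov">B. Malyutov</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="RBID">ISTEX:F438C14FA8A87CCBCE3F5DF947E3C1395655768B</idno>
<date when="2006" year="2006">2006</date>
<idno type="doi">10.1007/11889342_20</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

# ASSUMPTION: placeholder namespace URI; the record itself declares none.
WICRI_NS = "urn:example:wicri"


def parse_record(text):
    """Parse a record after binding the undeclared wicri: prefix (hypothetical helper)."""
    patched = text.replace("<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1)
    return ET.fromstring(patched)


def summarize(root):
    """Collect a few bibliographic fields from the TEI header (hypothetical helper)."""
    header = root.find("./TEI/teiHeader/fileDesc")
    pub = header.find("publicationStmt")
    return {
        "title": header.findtext("titleStmt/title"),
        "author": header.findtext("titleStmt/author/name"),
        "year": pub.find("date").get("when"),
        "doi": pub.findtext("idno[@type='doi']"),
        "rbid": pub.findtext("idno[@type='RBID']"),
    }


summary = summarize(parse_record(RECORD))
print(summary)
```

The same two helpers should apply to the full record, provided the opening `<record>` tag appears exactly as shown; the string patch is a shortcut for this sketch, not a general way to repair namespaces.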

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Musique
   |area=    MonteverdiV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:F438C14FA8A87CCBCE3F5DF947E3C1395655768B
   |texte=   Authorship Attribution of Texts: A Review
}}

This area was generated with Dilib version V0.6.21.
Data generation: Mon May 9 21:59:15 2016. Site generation: Mon Feb 12 09:57:54 2024